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AN  AI.GORIWUM  FOR  NONCONVEX  PROGRAMMING 
G.  Graves  and  A.  Wliinston 

1.  Introduction 

This  paper  presents  an  algorithm  to  solve  the  most 
general  mathematical  programming  problem 

A 

s.t.  g* (y)  ^0  i  =  1,2, ...,m 

Min.  g(y)  y  =  . ,y^) 

The  only  restriction  required  is  that  the  functions  g^,  g  be 
real  valued.  The  general  formulation  allows  for  nonlinear 
or  linear  integer  programming,  mixed  integer  programming  and 
general  nonemvex  continuous  variable  programming.  The 
extant  algorithms  for  this  most  general  problem  can  usually 
be  viewed  as  local  search  procedures.  They  suffer  from  two 
serious  difficulties  which  can  be  characterized  as  the  "dimen¬ 
sionality  problem"  and  the  problem  of  "trapping  at  local 
optima".  These  difficulties  are  illustrated  by  the  "local 
corner  search"  where  each  of  the  2" 
adjacent  corners  of  a  current  point 
are  evaluated  and  the  best  of  these  is  used  as  the  next 
current  point.  The  number  of  function  evaluations  increase 
exponentially  with  the  number  of  variables  and  the  procedure 
is  impossible  except  for  problems  with  very  few  variables. 

As  is  well  known,  this  procedure  stabilizes  at  local  optima. 
Traditionally,  convexity  is  invoked  by  mathematicians  to 
eliminate  this  sort  of  unpleasantness.  As  a  practical  maLcer 
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with  real  problems,  convexity  is  never  established.  In  fact, 
the  essence  of  location  of  facilities  problems  is  precisely 
the  tradeoff  between  the  economies  of  scale  in  production 
and  the  transportation  cost.  (Edonomies  of  scale  imply 
minimization  of  concave  functions.) 

The  classical  approaches,  then,  have  been  essentially 
"local"  or  "neighborhood"  techniques  dependent  on  derivatives 
(or  finite  difference  approximations  to  derivatives) .  Only 
unrealistic  assumptions  such  as  "convexity"  or  vague  arm 
waving  such  as  "try  a  representative  sample  of  starting  points" 
have  been  advocated  to  deal  with  the  global  problem.  (Obtain¬ 
ing  a  "representative  sample  of  starting  points"  is  feasible 
with  small  generally  artificial  examples.)  We  feel  this 
sweeps  the  very  quintessence  of  many  economic  problems  under 
the  rug.  Our  central  aim  here  is  to  present  a  new  framework 
for  reaching  global  optimum.  The  procedure  involves  two 
interconnected  mechanisms,  a  method  for  structuing  the  search 
and  a  decision  rule  for  selecting  the  course  of  the  search. 

2.  Structuring  the  Search 

Structuring  the  search  consists  of  introducing  a 
framework  for  reducing  the  general  problem  to  that  of 
"implicit  enumeration"  [1].  In  general,  given  a  bounded 
domain  P,  it  can  be  symetrically  partitioned  into  subdomains 
**1^**2' *  * For  example. 
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Technically; 

given  b(i)  ^  y(i)  ^  S(i) 
define  r(i)  =  (s(i)  -  b(i))/2 
y(i)  =  b(i)  +  r(i) 

and  introduce  the  class  C  of  finite  maps 
(u*  (!>•••  ,n]  -♦  (0,1). 

Now  a  i-1  correspondence  can  be  setup  between  the  components 
of  the  partition  of  P  and  the  class  of  maps  C  by  defining 
the  upper  and  lower  bounds  of  a  component  in  terras  of  a  map 

I*(i,(u(i))  =  y(i)  ~  (l-a>(i)  ) -r  (i) 

U(i,u)(i))  =  y(i)  +  a)(i)'r(i) 

To  illustrate  these  formulas,  we  can  apply  them  to  the  two 
dimensional  unit  square,  in  this  event, 

0  i  y{i)  i  1  i  =  1,2 

e.g.  b(l)  =  0  S(l)  =  1 

b(2)  =  0  S(2)  =  1 

r(l)  =  1/2 
y(l)  =  1/2 


and 


r(2)  =  1/2 
y(2)  =  1/2 
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Using  these  quantities: 

L{l,u)(l))  =  1/2  -  (l-u)(l)) -1/2 

U(l,u)(l))  =  1/2  +  (u(l)-l/2 

and 

L{2,u)(2))  =  1/2  -  (l-a)(2))  *1/2 

U(2,(u(2))  =  1/2  +  a)(2)*l/2. 

The  choice  of  any  of  the  four  different  maps  (coCD,  u)(2)) 
specifies  a  particular  rectangle. 


«(1)  =  0 
0(2)  =  1 


Ihis  map  specifies  rectangle  3, 

L(l,(i)(l))  =  0  U(l,u)(l))  *=  1/2 

L(2,«(2))  =  1/2  U(2,(ij{2))  =  1. 

The  problem  is  now  reduced  to  choosing  a  desirable 
map  and  further  refining  the  corresponding  component  until 
a  point  is  specified  to  any  predetermined  accuracy. 

Technically  this  can  be  setup  recursively  by  taking, 
r°(i)  =  (S(i)  -  b(i))/2 
P°(i)  =  b(i)  +  r®(i) 
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and 


r^(i)  =  {i)/2 


y^(i)  =  -  (l-aJ^"^Ni))*r^(i)  +  ob*  T^d) 

til 

and  again  for  any  map  lu  at  the  i  stage ^ 

L^(i,(i,(i))  =  y*'(i)  -  (l-u,(i)) -r^Ci) 

U*=(i,(u(i))  =7*'(i)  +  a)(i)-r^{i). 

Now  specifying  a  sequence  of  maps 
^  o  1  2  \ 

V  U)^  t  Ci)^  j  U)^  s  •  •  •  ) 

specifies  a  sequence  of  nested  intervals  for  each  i 


[L*^(i,u)°(i),  U^(i,u,°{i)l 


such  that 


L  (i,co*(i))-. - -  (monotonically  increases  with  t) 

t  t 

U  (i,  0,*  (i)  )'-~~^^(monotonically  decreases  with  t) 
and  [U^(i,u,J(i))  “  I.^(i,,„J(i))l  «  (S(i)  -  b(i))/2^‘^^ 

-«  0  as  t  -•  «. 

Therefore  a'^^quence  of  maps  (ou*  ^  •  •  • )  defines 

an  n-tuple  of  real  numbers  or  a  point  in  R^.  (Recall  the 

Weirstrauss_Heine  development  of  the  real  numbers.  Their 

definition  is;  "A  real  number  is  a  nest  of  intervals  (x  ,y  ) 

n  n 

such  that  fx  }  is  mono  tonic  increasing,  fy..)  is  mono  tonic 
n  n 

decreasing,  and  d^  =  (y^  -  x^)  -*  0  as  n  ^  cd.  "  See  Knopp 
[3],  Chapter  1.)  Now  with  any  stipulated  accuracy  for  the  solution 
y*(i)  ♦  €  take  the  first  positive  integer  T  such  that  (S(i)  -  b(i))/2^'^ 

<  €  for  all  i  or  >  (S(i)  -  b(i))/6. 
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Introduce  the  class  C  of  meta-maps 

5:  Cl,2, . . . ,nxT}  -  (0,1). 

The  choice  of  a  5  determines  a  "quantitized"  point  in  the 
domain  of  interest.  The  problem  is  reduced  to  choosing  the 
optimal  meta-map  The  algorithm  we  propose  is  to  implicitly 

enumerate  the  class  C  of  meta-maps.  There  are  of  course  many 
other  ways  of  "quantitizing"  the  domain  suitable  for  implicit 
enumeration.  The  employment  of  the  present  structure  and, 
in  particular,  the  T  sub-maps  (u)  ,  ou  , .  - . ,  u)  )  to  specify  $  is 
to  isolate  for  easy  exploitation  the  nested  components  of 
the  successive  partitions  identified  by  the  uj^.  ^  is  these 
nested  components  that  allow  us  to  introduce  set  functionals 
for  decision  making  and  a,  global  approach  to  calculating 
the  optimum  independent  of  such  restrictions  as  convexity 
on  the  original  functions. 

3.  Decision  Rules  for  Directing  the  Search. 

The  most  common  set  functional  in  mathematics  is  the 
ordinary  integral.  It  is  our  contention  that  use  of  this 
functional  instead  of  resorting  to  the  derivative  or  its 
finite  difference  counterpart  of  the  "local"  procedures 
should  encUble  us  to  utilize  global  information.  Liberating 
our  decision  process  from  the  myopic  local  neighborhood 
processes  should  render  us  insensitive  to  trapping  at  local 
optima  and  enable  us  to  dispense  with  inapplicable  mathemati¬ 
cal  assumptions  such  as  "convexity".  The  most  elementary 
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use  of  the  integral  would  be  to  simply  calculate  for  each 
component  (defined  by  an  element  u,^  of  the  meta-map)  the 
following  quantities: 


AV((u  ,g)  = 


Trr  (i) 

i 


jU^(l,m(l)) 

L^(l,u)(l) 


L^(n,uj(n)  ^ 


. ..dy^ 


SS(a)^,g)  = 


rd))  _  jU^(n,u){n))g2 
,d))  “‘L^(n,u)(n)) 


(y)dy, ...dy 


n 


SGM(„„g)  =  [(SS(a,Sg)  -  AV^  (u,^,g)  ] 
d(uj^,g)  =  AV((0,g)  -  vSGM(cu,g) 


The  component  of  the  meta-map  chosen  would  be  such 

that  d(u)5,g)  =  min  d(u)^,g). 

U)€C 

The  decision  functional  d(u)^,g)  is  a  simple  estimator  of  the 
minimum  value  of  the  function  g(y)  on  the  associated  com¬ 
ponent  of  the  partition.  If  no  knowledge  of  the  underlying 
distribution  is  available,  the  parameter  v  in  the  definition 
of  d(u}^,g)  would  have  to  be  determined  empirically  or  several 
runs  made  using  various  values. 

This  simple  procedure  suffers  from  the  same  "dimen¬ 
sionality  problem"  as  the  local  search  procedures.  The 
evaluation  of  the  decision  functional  d((B^,g)  for  all  possible 
2*^  maps  o>^  would  impose  an  intolerable  computational  burden 
(except  for  artificial  mathematical  examples) .  This 
"dimensionality  problem"  can  be  eliminated,  however,  by 
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rosortimj  Lo  an  n-stage  sequential  decision  process.  The 

total  map  u)^  would  be  constructed  in  n-steps  by  sequentially 

fixing  elements  of  the  map.  Suppose  an  arbitrary  set  of  k 

out  of  the  possible  n  elements  of  the  domain  are  fixed.  At 
s  t 

the  (k+1)  step  an  additional  element  of  the  domain,  say 
is  choosen  and 

^+1  -  °  ^+1  -  1* 

Now  if  the  order  of  fixing  elements  of  the  domain  is  com¬ 
pletely  arbitrary,  there  would  be  2  (n-k)  possible  choices 
of  a  couple  each  stage.  The  total 

number  of  functional  evaluations  would  reduce  to 

n-1  n 

2  2  (n-k)  *  2  2  k  *  n(n+l) . 

k=0  k»l 

(This  reduction  is  insignificant  for  3  or  4  variables,  but 
with  as  few  as  20  variables  we  would  achieve  a  reduction 
from 

»  1,048,576 
to 

20-21  =  420.) 

In  the  n-stage  sequential  process,  it  is  necessary 

I 

to  use  a  slightly  more  sophisticated  decision  functional. 
Each  choice  is  now  determined  by  expected  values  over  all 
completions  of  the  k-partial  map.  Given  a  k-partial  map, 
for  example. 


9. 


tixi'd  free 


there  are  2*^  ^  possible  completions.  We  then  employ  the 
following  expected  values  over  the  completion  class  Cj^ 


(AV(u)^g)) 


1 


U^(l,o)(l)) 

L^(1,U)(1)) 


(Jc,u)(k)) 

*L^  (k-l,u)(k)) 


U^(k+l,l) 

L^(k+1,0) 


^(n,l) 

(n,0) 


g(y)dyj^- 


n 


£ 


Ok 


(SS((u^g)) 


2^~^rtr(2)  t'’ (1,^(1)) 


U^(l,a.(l)) 

it , 


U^(k,u)(k)) 

L*'(k,u>(k) 


U^(k+l,l) 

L^(k+1,0) 


tJ^(n,l)  - 

•/t  9  iy)dy^ 

L^(n,0)  ^ 


These  results,  of  course,  rely  on  the  "additivity"  of  the 
limits  of  integration. 


Using  these  more  sophisticated  quantities  we  proceed  as  before 
by  calculating 

(/,g)  =  [E^(SS(u)^,g)  )  -  E^(AV(u,Sg) )  1 


that  is,  the  standard  deviation  of  g(y)  on  the  components 
and 


(AV(a)*^,g))  -  (u),g)  . 

Oje 

(k)  t 

The  decision  functional  d  (u)  ,g)  is  evaluated  for  the  2  (n-k) 
possible  couples,  say  -  1  or 


* 


10. 


"free"  element  of  the  k-partial  map.  The  minimum  value  of 
fki  t 

d  (u>  tS)  determines  the  next  couple  to  be  fixed. 

This  whole  n-stage  sequential  decision  process  is  then 
carried  out  T  times  as  indicated  in  Section  1  to  yield  a 
"point”  in  r”  which  is  hopefully  very  close  to  the  global 
minimum  of  g (y) .  In  any  event,  by  continuing  and  employing 
a  "confidence  level  implicit  enumeration"  (see  [  1  ]  and  [ 2  ] ) 
of  the  whole  class  C  of  meta-maps,  we  should  achieve  a  highly 
sophisticated  search  of  the  whole  domain.  The  only  point  to 
note  in  employing  the  mechanism  of  the  "confidence  level 
enumeration"  is  that  the  recursive  definition  of  the  com¬ 
ponents  would  require  u)^  to  be  entirely  fixed  before  any 
element  of 


4.  Additional  Observations 
(A)  Limiting  Value 

When  the  function  g(y)  is  continuous,  it  might  be 
worth  noting  that 

as  t  -  • 

where  y*  c  R** 

is  the  point  defined  by  the  sequence  of  maps  (ou*,  id*,  . . . . ) . 
Qhis  follows  immediately  from  the  Meeui  Value  Theorem  for 
Integrals  which  says: 


AREA  ^  “ 

where  y  e  D. 


g(y) 
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f  n  ^  t 

Applying  this  result  to  the  terms  of  '  Lw  ,9)  yields 

E_  (AV(m^,g)) g(y*)  as  t  -*  » 

*“11 

(u)^,g)  -  0  as  t  CD 

and  hence 

d^”^  (u>^,g)  ■*  g{y*)  for  any  v  as  t  ..  eo- 

(B)  Indefinite  Integral 

The  evaluation  of  the  integrals  employed  in  the 

(V )  t 

definition  of  the  decision  functional  d  {u>  ,g)  can  be 
carried  out  in  various  ways.  With  continuous  functions,  the 
simplest  procedure  is  to  employ  the  closed  form  given  by  the 
indefinite  integral;  for  example, 

f  vvdvdv  - 

AREA  /  yi^2^^1°^2  “4(Uj^  -  -  ^2^  ~  4 

(C)  Stratified  Sampling 

When  the  function  is  not  known  in  closed  form  or  the 
indefinite  integral  is  not  available,  it  may  become  necessary 
to  resort  to  stratified  sampling  of  the  various  components 
of  the  domain  defined  by  the  limits  of  integration  in  the 
decision  functional;  for  example, 

D 

(Insure  uniform  distribution 
of  points  over  component.) 
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We  could  determine  an  appropriate  sample  size  for  each 
strata  and  on  the  basis  of  this  sample  calculate  estimates 
(AV(ou*^,g))  and  (u»^,g) 

(Ic)  t 

and  from  these  calculate  d  (u)  ,g)  .  At  any  decision  point, 
we  are  stratifying  a  domain  of  the  formt 

L^{l,u)(l))  ^  y^  ^  U^(l,u)(l)) 

•  •  • 

•  •  • 

L*'(k,*u)(k))  ^  yj^  ^  U^CkiiuCk)) 

L^Ck+1,0)  ^  ^  U^{k+l,l) 

•  •  • 

•  •  '  * 

L(n,6)  i  y^  i  U^Cnil)  . 


(D)  Discrete  Variables 

It  is,  of  course,  not  necessary  that  the  variables 
be  continuous.  The  Rieman-'Stieltzes  Integral  is  available 
to  deal  with  discrete  variables.  Recall  the  usual  Unit 


Step  Function 


[0  (y  i  0) 

Ky)  =  ■ 

(i  (y>o)J 

€Uid  Standard  Counting  Measure 


Oj^(y^)  *=  Ky^)  +  Ky^  “  D 


that  would  be  employed  with  zero-one  discrete  variables. 
(A  slight  generalization  would  eliminate  the  reduction  to 
zero-one  discrete  variables.)  In  this  formulation, 

l+€  l+€ 

"AREA"  «=  /  .../  d<y,  ...d»  -  2 

0  0  . 
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and  for  illustrative  purposes,  consider  the  simple  linear  case 
n 

g(y)  =  Z  a.y.. 
i=l  ^  ^ 


Take  S(i)  =  1+e  and  b(i)  =  0 


.  ,  l+€  l+€ 

P  E  (AV(u^  ,g))  =  2  /  /  (aiYi  +  a  y  )da^da2 

_2.C^ 


l+€ 


2  i  ‘>1  ♦  •2y2>'^«2  "  *1  *  ^ 


Hence,  as  expected,  the  decision  of  whether  y^^  0  or  y^^  1 

is  determined  by  whether  a^^  <  0  or  a^^  >  0.  This  general 
approach  reduces  to  techniques  expounded  in  great  detail  in 
[1].  It  should  be  stressed  that  the  Riensn-Stieltzes  Integral  Approach 
developed  in  this  section  is  perfectly  cqpable  of  handling  pure  con¬ 
tinuous  variables,  mixed  continuous  and  integer  variables,  or  pure 
integer  variables. 
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(E)  Constraints 

The  ideas  developed  in  this  paper  can  be  extended  to 
treat  constraints  of  the  form 

g^(y)  ^0  i  *  1,2,. ..,m 

by  introducing  conditional  expected  values.  Ihe  simplest  way 
to  achieve  this  is  through  the  use  of  a  Regression  Equation. 
Instead  of  using  AV(u)^,g),  this  would  require  employment  of: 

,t  .i  t  COV(u)Sg,g^) 

AV(u)  ,(g  g  ))  =  AV((i.\g  )  + - r -  •  (g^  -  AV(u)Sg^), 

Var  (g^) 

the  conditional  expected  value  of  the  function  g  given  a  value 
of  fvmction  g^ .  In  this  procedure,  it  would  be  necessary  to 
estimate  the  maximum  or  minimum  of  (g^  -  AV(u)^,g^))  depending 
on  the  sign  of  the  covariance  on  the  c(»nponents  specified  by 
the  current  k-partial  map  Ihis  could  be  done  in  turn 

in  terms  of  the  variance  of  g^  and  its  mean.  It  would  also 
be  necessary  to  establish  an  appropriate  confidence  level 
that  g^  (y)  ^  0  on  the  component.  When  the  confidence  drops 
too  low  it  is  necessary  to  "backtrack”  in  the  construction 
of  the  meta-map.  It  should  be  observed  that  "normality" 
assumptions  are  not  required  for  this  procedure,  but  in  the 
event  of  non-normality,  the  linear  regression  equation  re¬ 
duces  to  a  first  order  approximation.  Again,  these  ideas 
are  developed  at  greater  length  in  [  1 1 . 
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5.  HBplementation  of  the  Algorithm 

The  algorithm  has  been  programmed  in  FORTRAN  IV  and  several 
exaa^le  problems  have  been  run  on  the  C.D.C.  6^00.  Numerous  approaches 
to  implementing  the  ideas  described  above  are  possible.  The  present 
program  determines  indefinite  integral  and  successively  reevaluates  at 
the  upper  and  lower  bounds  for  the  appropriate  domains  as  specified  in 
the  search  procedure.  This  approach  limits  the  size  of  the  problems 
that  can  be  handled  since  the  storage  requirements  for  the  mean  and 
variance  are  considerable.  For  large  scale  problem  use  of  the  ideas 
discussed  under  stratified  san^ling  or  numerical  approximations  may 
prove  more  effective.  It  should  be  noted  that  the  decision  process 
is  generally  Insensitive  to  the  errors  resulting  from  an  approximate 
determination  of  the  values  of  the  decision  variables. 

Several  problems  obtained  from  the  literature  were  used  to 
test  the  algorithm.  Each  problem  was  run  for  a  succession  of  values 
of  Vj  for  v  =  .5,  .15,  1  and  1.5*  Below  we  list  some  results  obtained 
using  a  C.O.C.  6500.  Since  the  code  was  experimental  a  considerable 
amount  of  time  was  taken  in  calculation  not  directly  used,  including 
the  determination  of  higher  moments.  A  rou^  estimate  would  be  that 
the  xnroblems  could  be  solved  in  one  half  of  the  stated  time.  Further 
precision  could  be  obtained  in  the  answer  by  using  a  gradient  method 
in  the  vicinity  of  the  solution  obtained  by  this  algorithm. 
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o  p 

Problem  1  Min  -  ^Xg  -  ^Xj^  +  -Jxg 

s.t.  2x3^  +  Xg  <  6 

-Xj^  +  Uxg  *  6 

x^  se  0  i  =  1,2 

The  optimal  solution  is  (3,  0)  and  a  value  of  -3  for  the  criterion 
function.  We  obtained  (2.929,  .07812)  and  -2.8627  with  v  =  1.5  in 
5  ceconds  of  computer  time. 

Problem  2  Max  x^^  Xg  Xj 

Xj^  +  2Xg  +  2xj  *  72 

0  *  Xj^  *  42  i  =  1,2,3 

The  answer  is  (24,  12,  12)  with  a  value  of  3,  456  and  we  obtained 
(25.02)  9.844,  13.617)  with  a  value  of  3,  354.  The  value  of  v  was 
15  and  the  coiqmtations  took  about  5  seconds. 
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